Application No. 09/785,793 
Amendment dated September 29, 2003 
Reply to Official Action of July 29, 2003 

REMARKS 

Claims 1-11 remain in this application. 

Rejection under 35 USC 112 

The new claims are directed to a method for detecting and/or purifying only biomolecule 
and/or protein complexes. In this context, the Examiner's attention should be drawn to the 
examples of the application. Example 1 illustrates the purification of protein complexes from yeast. 
Example 2 also explains the purification and detection of protein complexes and/or protein subunits 
of a complex from yeast. 

To further support the suitability of the inventive method for the purification and 
identification of biomolecule and/or protein complexes, please find enclosed literature applying the 
inventive method (Gavin et al. (2002), Nature 415, 141-146). In this article, the purification and 
identification of about 200 protein complexes using a "tandem-affinity purification" (TAP), which 
corresponds to the inventive method, is described. This article clearly shows that the inventive 
method can be used for detecting and/or purifying biomolecule and/or protein complexes. 

Rejection under 35 USC 102 

The subject matter of the present application as defined by the new claims is novel over the 
cited Darzins et al. Amended claim 1 is directed to a method for detecting and/or purifying 
biomolecule and/or protein complexes. As to the complexes and/or complex formation, it is 
particularly important that the subunits of the complex be present in their native form. The 
inventive method advantageously enables the generation of highly purified biomolecule and/or 
protein complexes, which exhibit their natural activity and are present in the form of their natural 
complexes. To express the biomolecule and/or protein complexes in their native form, the protein 
complex to be purified is preferably expressed in its natural host (cf. page 6, lines 20-21). 
Thus, the inventive method can be used for detecting and/or purifying biomolecule and/or protein 
complexes from any organism, i.e. also from eukaryotes. This ensures that the subunits and 
protein complexes carry the correct post-translational modifications, which are often 
required for biological activity. 

Darzins et al. describe a method for expressing a desired protein in gram-positive bacteria. 
First of all, it has to be emphasized that the object of Darzins et al. is a production/over-production 
of one desired protein (see page 1, line 9/10; page 23, line 2 and 11). Thus, Darzins et al. fail to 
describe a detection and/or purification of complexes. The method according to Darzins et al. does 
not at all enable the detection and/or purification of complexes as the Darzins-method is exclusively 
suited for gram-positive bacteria and thus cannot provide the correct post-translational modifications 
which are required for the biological activity of subunits /complexes. 

Rejections under 35 USC 103 

The subject matter of the present invention is not rendered obvious by combining the cited 
Darzins et al. and Zheng et al. either. 

Zheng et al. (Gene 186 (1997), 55-60) only report the use of calmodulin-binding protein for 
the purification of proteins over-expressed in E. coli (cf. e.g. page 55, first paragraph under the title 
"Introduction", page 56, right column under 2.1 as well as page 60, left column, lines 11-12). Further, 
Zheng et al. do not suggest a purification procedure using at least two different affinity purification steps 
but rather the use of a single step using calmodulin affinity chromatography (cf. abstract, lines 2-3). 

The present invention therefore provides a system allowing detection and/or efficient 
purification of biomolecules and/or proteins expressed at low level, preferably in their natural hosts, 
while maintaining them in functional complexes. It was not known previously that a combination 
of two affinity tags could be used for this purpose. The combination of tags required for this new 
application was not known and previous publications did not reveal that the combination disclosed 
would be successful. 

Applicant respectfully requests that a timely Notice of Allowance be issued in this case. 

The Commissioner is hereby authorized to charge any additional fees which may be required 
in this application to Deposit Account No. 06-1135. 



Respectfully submitted, 
Fitch, Even, Tabin & Flannery 

James P. Krueger 
Registration No. 35,234 

Date: September 29, 2003 

FITCH, EVEN, TABIN & FLANNERY 
120 S. LaSalle St., Suite 1600 
Chicago, Illinois 60603 
Telephone (312) 577-7000 
Facsimile (312) 577-7007 
Functional organization of the yeast 
proteome by systematic analysis of 
protein complexes 
Most cellular processes are carried out by multiprotein complexes. The identification and analysis of their components provides 
insight into how the ensemble of expressed proteins (proteome) is organized into functional units. We used tandem-affinity 
purification (TAP) and mass spectrometry in a large-scale approach to characterize multiprotein complexes in Saccharomyces 
cerevisiae. We processed 1,739 genes, including 1,143 human orthologues of relevance to human biology, and purified 589 protein 
assemblies. Bioinformatic analysis of these assemblies defined 232 distinct multiprotein complexes and proposed new cellular 
roles for 344 proteins, including 231 proteins with no previous functional annotation. Comparison of yeast and human complexes 
showed that conservation across species extends from single proteins to their molecular environment. Our analysis provides an 
outline of the eukaryotic proteome as a network of protein complexes at a level of organization beyond binary interactions. This 
higher-order map contains fundamental biological information and offers the context for a more reasoned and informed approach 
to drug discovery. 



A formidable challenge of postgenomic biology is to understand 
how genetic information results in the concerted action of gene 
products in time and space to generate function. In medicine, this is 
perhaps best reflected in the numerous disorders based on polygenic 
traits and the notion that the number of human diseases 
exceeds the number of genes in the genome. Moreover, the total 
number of human genes does not differ substantially from the 
number of genes of the nematode worm Caenorhabditis elegans, 
suggesting that 'complexity' may partly rely on the contextual 
combination of the gene products 2. Dissecting the genetic and 
biochemical circuitry of a cell is a fundamental problem in biology. 
At the biochemical level, proteins rarely act alone; rather, they 
interact with other proteins to perform particular cellular tasks. 
These assemblies represent more than the sum of their parts by 
having a new 'function'. 

Our knowledge regarding the identity of the building elements of 
specific complexes is limited and is based on selected biochemical 
approaches and genetic analyses. The only comprehensive protein-interaction 
studies are based on ex vivo and in vitro systems, such as 
two-hybrid systems and protein chips 7, and need to be integrated 
with more-physiological approaches. Whenever it has been possible 
to retrieve and analyse particular cellular protein complexes under 
physiological conditions, the insight gained from the analysis has 
been fundamental for the biological understanding of their function, 
and has often taken the analysis well beyond the limits of 
genetic analysis. Prominent examples are the spliceosome, the 
cyclosome, the proteasome, the nuclear pore complex and the 
synaptosome. No systematic analysis of protein complexes 
from the same cell type using the same technique has yet been 
reported. We have performed a comprehensive analysis of protein 
complexes of baker's yeast, S. cerevisiae, a model system relevant to 
human biology. 



Large-scale analysis of protein complexes 

To systematically purify multiprotein complexes, we developed the 
strategy depicted in Fig. 1. Gene-specific cassettes containing the 
TAP tag, generated by polymerase chain reaction (PCR), were 
inserted by homologous recombination at the 3' end of the genes. 
We processed 1,739 genes, including 1,143 genes representing 
eukaryotic orthologues 3. Orthologues are thought to have evolved 
by vertical descent from a common ancestor 10 and are presumed to 
carry out the same function. For comparison, we also targeted a 
nonorthologous set of 596 genes from chromosomes 1, 2 and 4. To 
test the tagged genes in the absence of the wild-type allele, we used 
haploid cells. We generated a library of 1,548 yeast strains, of 
which 1,167 expressed the tagged proteins to detectable levels 
(Fig. 1). After growing cells to mid-log phase, assemblies were 
purified from total cellular lysates by TAP. This technique combines 
a first high-affinity purification, mild elution using a site-specific 
protease, and a second affinity purification to obtain 
protein complexes with high efficiency and specificity. The purified 
protein assemblies were separated by denaturing gel electrophoresis, 
individual bands were digested by trypsin, analysed by 
matrix-assisted laser desorption/ionization-time-of-flight mass 
spectrometry (MALDI-TOF MS) and identified by database 
search algorithms. In all, 293 proteins were localized at membranes 
(integral and peripherally associated) 22. Because their purification 
required a separate protocol, only 70 of the membrane-associated 
proteins were analysed, of which 40 were purified successfully. We 
analysed the proteins found associated to the 589 (416 orthologues) 
successfully purified tagged proteins (the 'raw' set of purifications; 
see Supplementary Information Table S1). This generated 20,946 
samples for mass spectrometry and subsequently identified 16,830 
proteins. Of these, 1,440 were distinct gene products, representing 
about 25% of the open reading frames (ORFs) in the genome. The 
analysis covers proteins of various subcellular compartments, supporting 
the generality of our approach (Figs 1b, 2a). 

Sensitivity, specificity and reliability of the approach 

Of the 589 purified tagged proteins, 78% presented associated 
partners, showing that the method is very efficient for the large-scale 
retrieval and identification of cellular protein complexes. 
There are several possible reasons why, in the remaining 22%, 
we were unable to purify and identify interacting proteins. Particular 
proteins may not form any or sufficiently stable or soluble 
complexes. In other cases, the 20K (relative molecular mass Mr 
20,000) TAP tag may interfere with complex assembly or protein 
localization and function. Because we used haploid cells, we were 
able to score for viability. In 18% of the cases when essential genes 
were tagged, we did not obtain viable strains, confirming that 
carboxy-terminal tagging can impair protein function. Furthermore, 
the method may fail to detect transient interactions, low 
stoichiometric complexes, and/or those interactions occurring only 
in specific physiological states not present or under-represented in 
exponentially growing cells. In addition, the size distribution of the 
identified proteins reveals a clear technical bias against proteins 
below 15K (see Supplementary Information Fig. S1). However, in 
about 30% of the cases where we failed to purify complexes around a 
given protein, the protein was detected when other complex 
components (entry points) were tagged and purified. 

To assess the quality of the results obtained for purifications 
that do contain associated proteins, we compared our data to the 
literature (see below). We also established reproducibility of the 
approach by purifying 13 large complexes at least twice. The 
probability of detecting the same protein in two different purifications 
from the same entry point is about 70%. Therefore, on 
average, 30% of all detected associations presented in this study 
need to be treated with caution. This is particularly the case if 
complexes were retrieved from only one entry point. This variability 
may be inherent to the technique (biological samples, purification, 
mass spectrometry) but also to the large-scale nature of 
the approach, tuned to high complex coverage and high sensitivity. 

To determine the experimental background, we purified mock-transformed 
control strains, leading to the identification of 17 
contaminant proteins, mainly heat-shock and ribosomal proteins 
(see Supplementary Information Table S1). These are highly 
expressed proteins. Because these proteins appeared in more 
than 20 of our purifications (3.5%), we decided to use this cutoff 
point and pragmatically excluded another 49 proteins present at 
equal or higher frequencies in our screen from further analysis (see 
Supplementary Information Table S2). It is prudent to interpret 
data concerning often-purified proteins just below this cutoff point 
with caution. Finally, we cannot exclude the occasional artificial 
interaction generated during cell lysis. 
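
The frequency cutoff described in this paragraph can be illustrated with a short sketch. It is not the authors' software: the function names, the dictionary layout (entry point mapped to its list of identified proteins) and the reading of the cutoff as "seen in more than 20 purifications" are assumptions made for illustration only.

```python
from collections import Counter


def frequent_proteins(purifications, max_count=20):
    """Return proteins identified in more than `max_count` purifications.

    `purifications` maps an entry point (tagged protein) to the proteins
    identified with it. The layout is a hypothetical one chosen for this sketch.
    """
    counts = Counter()
    for proteins in purifications.values():
        # Count each protein at most once per purification.
        counts.update(set(proteins))
    return {protein for protein, n in counts.items() if n > max_count}


def filter_purifications(purifications, excluded):
    """Drop the excluded (background) proteins from every purification."""
    return {
        bait: sorted(set(proteins) - excluded)
        for bait, proteins in purifications.items()
    }


if __name__ == "__main__":
    # Toy data with a cutoff of 2 instead of 20.
    toy = {
        "BaitA": ["P1", "Hsp"],
        "BaitB": ["P2", "Hsp"],
        "BaitC": ["Hsp", "P1"],
    }
    background = frequent_proteins(toy, max_count=2)
    print(background)
    print(filter_purifications(toy, background))
```

Proteins found in mock-transformed controls would presumably be added to the excluded set by hand; the sketch covers only the frequency-based part of the filter.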

Although stoichiometry was not assessed in this study, we have 
observed that proteins belonging to the contaminant list generally 
comprise the weak bands after staining, and typically are identified 
by fewer tryptic peptides. Moreover, the systematic cutting and 
analysis of gel slices led to the identification by mass spectrometry 
even of invisible proteins. The sensitivity of the TAP/MS method is 
high, because we were able to identify proteins present at 15 copies 
per cell (data not shown). Identified proteins were S.6K to 559K in 
size and ranged in pI between 3.9 and 12.4 (see full evaluation in 
Supplementary Information Fig. S1). 

Organization of the purified assemblies into complexes 

On the basis of substantial overlaps, we grouped the biochemical 
purifications obtained with 589 different entry points into a 
reduced number of biologically meaningful complexes. A total of 
243 purifications corresponded to 98 known nonredundant multiprotein 
complexes present in the yeast protein database (YPD; 60% 
of the data set). A further 242 purifications were assembled into 
134 new complexes. The remaining 102 proteins showed no 
detectable association with other proteins when purified directly, 
or as part of other complexes. The subsequent statistical analysis is 
based on a list that includes 232 annotated 'TAP complexes' (see 
Supplementary Information Table S3). 

Among the complexes that were assigned to the known YPD 
complexes, coverage of components was very high (Fig. 2b). Of all 
232 TAP complexes, only 9% had no novel element (Fig. 2c). The 
size of the TAP complexes varied from 2 to 63 components, with an 
average of 12 components per complex (Fig. 2d). We assigned 
cellular roles to complexes by computing functional assignments of 
the individual components according to YPD 51 and by literature 
mining (Fig. 2c, Supplementary Information Table S3). In general 
terms, there seemed to be a wide functional distribution of complexes 
over nine categories (Fig. 2e). Of the 304 proteins with no 
YPD functional annotation that were identified in our screen, we 
propose roles for 231 (Supplementary Information Table S3). 
Moreover, for 113 proteins that had a functional annotation, we 
discovered a new molecular context (Supplementary Information 
Tables S3, S4; see also examples below). 



Cohesive and dynamic complexes 

A particular complex is not necessarily of invariable composition 
nor are all its building blocks uniquely associated with that specific 
complex. With several distinct tagged proteins as entry points to 
purify a complex, core components can be identified and validated, 
whereas more dynamic, perhaps regulatory components may be 
present differentially. The dynamics of complex composition are 
well illustrated by the cellular signalling complexes formed around 
the protein phosphatase 2A (PP2A; yeast TAP-C151; see Supplementary 
Information Table S3). Tagging different known PP2A 
components resulted in the purification of the known trimeric 
complexes containing Tpd3 (the regulatory A subunit), either of the 
two catalytic subunits, Pph21 and Pph22, and either of the two 
regulatory B subunits, Cdc55 and Rts1. The Cdc55-containing 
complexes were found to additionally contain Zds1 or Zds2, 
known cell-cycle regulators, revealing preferences among the different 
complexes and a link to cell-cycle checkpoints. Additional 
plasticity of the PP2A complexes is apparent by the interaction with 
three proteins implicated in bud shape and morphogenesis (Lte1, 
Kel1 and YBL104C). This analysis also shows that the interactions of 
a signalling enzyme may be sufficiently strong to allow the detection 
of distinct cellular complexes and thus be diagnostic for a role of this 
enzyme in different cellular activities. 

An example of a large, cohesive complex is given by the polyadenylation 
machinery, which is responsible for the sequential 
steps necessary for eukaryotic messenger RNA cleavage and 
polyadenylation (yeast TAP-C162; see Supplementary Information 
Table S3). Using Pta1 as the entry point, we identified 12 of the 
13 known interactors and 7 new components (Fig. 3a). The 




Figure 3 Primary validation of complex composition by 'reverse' purification: the 
polyadenylation machinery. a, A similar band pattern is observed when different 
components of the polyadenylation machinery complex are used as entry points for affinity 
purification. Underlined are new components of the polyadenylation machinery complex 
for which a physical association has not yet been described. The bands of the tagged 
proteins are indicated by arrowheads. b, Proposed model of the polyadenylation 
machinery. 
composition of the complex was validated by extensive 'reverse' 
analysis, including purifications obtained with YKL059C and 
two of the new interactors (Fig. 3a). Four new components, Pdl, 
YKL018W, YKL059C and YOR179C, had no previous functional 
annotation, but could be included into a hypothetical model by 
bioinformatic analysis (Fig. 3b). Ssu72, another of the seven new 
interactors, had previously been reported to interact with TFIIB, 
strongly supporting a suspected link between the polyadenylation 
machinery and transcription. Thus, complexes are often sufficiently 
strong to show high composition integrity even when purified with 
different entry points. 

A higher-order organization map of the proteome 

After assigning individual proteins to protein complexes, we investigated 
relationships between complexes to understand the integration 
and coordination of cellular functions. We represented 
relationships by linking complexes that share components (Fig. 4). 
By plotting all the relationships, we obtained a network of complexes. 
Connections in this network not only reflect physical 
interaction of complexes, but may also represent common regulation, 
localization, turnover or architecture. Most complexes are 
linked. The more connected a complex, the more central its position 
in the network. Complexes composed of at least 50% orthologues 
are shown as double sized, and complexes are colour-coded according 
to cellular roles. Several complexes belonging to the same class 
appear to group, suggesting that sharing of components reflects 
functional relationships. These relationships are best observed with 
complexes involved in mRNA metabolism (orange), cell cycle (red), 
protein synthesis and turnover (light green), and intermediate and 
energy metabolism (violet). Complexes involved in transport of 
protein or RNA (pink), in contrast, appear more dispersed and have 
connections to complexes of all other cellular roles. There are several 
'satellite' complexes that do not seem to share components. Because 
this analysis is not exhaustive, we expect more connections for some 
of these complexes as more are purified and analysed. A software 
package (available at http://yeast.cellzome.com) allows the navigation 
of this proteome map at both the protein and complex level. 
Such a tool is essential to allow for proper data interpretation and 
for the generation of hypotheses leading to further experimental 
investigations. 
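
A minimal sketch of how such a network could be assembled is given here. It assumes complexes are available as a mapping from a complex name to its set of proteins; the function names and the `max_memberships` parameter are hypothetical, the latter mirroring the Figure 4 legend, which states that proteins found in more than nine complexes were omitted. The layout algorithm and colour coding are not reproduced.

```python
from collections import defaultdict
from itertools import combinations


def complex_network(complexes, max_memberships=9):
    """Link complexes that share at least one protein.

    Returns a dict mapping a sorted pair of complex names to the set of
    shared proteins. Proteins belonging to more than `max_memberships`
    complexes are ignored when drawing links.
    """
    membership = defaultdict(set)
    for name, proteins in complexes.items():
        for protein in proteins:
            membership[protein].add(name)

    edges = defaultdict(set)
    for protein, names in membership.items():
        if len(names) > max_memberships:
            continue  # promiscuous protein: would link almost everything
        for a, b in combinations(sorted(names), 2):
            edges[(a, b)].add(protein)
    return dict(edges)


def connectivity(edges):
    """Number of distinct complexes each complex is linked to."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


if __name__ == "__main__":
    toy = {"C1": {"a", "b"}, "C2": {"b", "c"}, "C3": {"d"}}
    links = complex_network(toy)
    print(links)                # C1 and C2 share protein b
    print(connectivity(links))  # C3 is a 'satellite' and does not appear
```

Complexes absent from the connectivity result share no components with any other complex, which corresponds to the 'satellite' complexes mentioned in the text.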

Parallel analysis of human and yeast complexes 

Orthologous gene products are thought to be responsible for 
essential cellular activities. We found that orthologous proteins 




Figure 4 The protein complex network, and grouping of connected complexes. Links were 
established between complexes sharing at least one protein. For clarity, proteins found in 
more than nine complexes were omitted. The graphs were generated automatically by a 
relaxation algorithm that finds a local minimum in the distribution of nodes by minimizing 
the distance of connected nodes and maximizing distance of unconnected nodes. In the 
upper panel, cellular roles of the individual complexes (ascribed in Supplementary 
Information Table S3) are colour coded: red, cell cycle; dark green, signalling; dark blue, 
transcription, DNA maintenance, chromatin structure; pink, protein and RNA transport; 
orange, RNA metabolism; light green, protein synthesis and turnover; brown, cell polarity 
and structure; violet, intermediate and energy metabolism; light blue, membrane 
biogenesis and traffic. The lower panel is an example of a complex (yeast TAP-C212) 
linked to two other complexes (yeast TAP-C77 and TAP-C110) by shared components. 
It illustrates the connection between the protein and complex levels of organization. Red 
lines indicate physical interactions as listed in YPD. 
preferentially interact with complexes enriched with other orthologues 
(mean 53%; Fig. 2f). In comparison, nonorthologous 
proteins have a lower propensity for such interaction (mean 
31%). Similarly, the likelihood to interact with essential gene 
products is higher for essential (44%) than for nonessential 
(17%) proteins. This supports the existence of an 'orthologous 
proteome' that may represent core functions for the eukaryotic 
cell. To determine whether the TAP strategy can be applied to 
retrieve equivalent multiprotein complexes from yeast and human 
cells, we compared different complexes from distinct subcellular 
compartments: Arp2/3, a cytoskeleton-associated complex, Ccr4-Not, 
a nuclear assembly, and Trapp, a Golgi-associated complex. 
The Arp2/3 complex is a stable multiprotein assembly required for 
the nucleation of actin filaments in all eukaryotic cells and consists 
of seven proteins in human and yeast. TAP of Arp2 in yeast (TAP-C153) 
and ARPC2 in human resulted in the isolation and 
identification of all known components. This indicates that the 
TAP approach combined with liquid chromatography coupled to 
tandem mass spectrometry (LC/MS/MS) is an efficient and sensitive 
method for the retrieval and characterization of human multiprotein 
complexes (Fig. 5a). 

The yeast Ccr4-Not complex (TAP-C149) is involved in the 
control of gene expression and consists of eight components. 
Potential human orthologues have been identified and characterized 
for yeast Not2, Not3, Not4 and Caf1, but not for yeast Not1 and 
Ccr4. Moreover, no respective multiprotein complex has yet been 
described in mammalian cells 27. TAP of tagged human NOT2 
resulted in the identification of a multiprotein complex consisting 
of human NOT2, CAF1 and CALIF, and two functionally nonannotated 
gene products, encoded by KIAA1007 and KIAA1194, 
which we could assign as the orthologues of yeast Not1 and Ccr4, 
respectively (Fig. 5b). Purification of tagged yeast Ccr4 resulted in 
the identification of a complex component, Caf40, that has an 
orthologous counterpart, Rqcd1, also identified in the human 
complex. These data strongly suggest that the human and yeast 
Ccr4-Not complexes are comparable in subunit composition. 

As a third example we purified and characterized an orthologous 
human TRAPP (transport protein particle) complex. The 
yeast complex contains ten subunits that are required for docking 
of transport vesicles derived from the endoplasmic reticulum to 
the cis-Golgi (yeast TAP-C102). The human complex had been 
purified previously as an assembly of about 670K; however, apart 
from human BET3 and TRS20, none of the other complex 
subunits had been identified. TAP purification of tagged 
human BET3 resulted in the identification of a complex consisting 
of human BET3, MUM2, R326U_2, Sedlin, EHOC-1, PTD009 
and KIAA1012, which we assigned as the orthologues of yeast Bet3, 
Bet5, Trs33, Trs20, Trs130, Trs23 and Trs85, respectively (Fig. 5c). 
Taken together, these examples show that the analysis of yeast 
complexes can often predict the composition of the human 
counterparts. This large-scale yeast proteome analysis could have 
immediate functional implications for human biology. 

Discussion 

To assign cellular functions to new, nonannotated gene products, 
and to understand the context in which proteins operate, several 
large-scale approaches have been undertaken. These include monitoring 
of mRNA expression (chips and serial analysis of gene 
expression (SAGE)), loss-of-function approaches combined with 
subcellular localization screens (in yeast, RNA-mediated interference 
in C. elegans, gene trap in mice), computational in 
silico methods (protein fusions, gene neighbouring, structural 
predictions), and extensive two-hybrid screens and protein 
chip analysis 7,37. The TAP/MS-based functional proteomics 
approach presented here may well constitute the largest analysis 
of protein complexes to date. We confirmed the expression of 1,440 
ORFs as annotated in the genome, of which 59 had been assigned 
only as hypothetical. A large-scale analysis of yeast proteins performed 
previously on a crude cell extract identified 1,484 different 
proteins from exponentially growing cells. Another 714 proteins 
were detected in our study. This raises the total number of 
ascertained proteome components to 2,210. 

TAP proved invaluable for the purification of complexes from 
different cellular compartments, including complexes associated 
with cellular membranes. The approach also allows for the efficient 
identification of low-abundance proteins that would not be detectable 
by approaches involving expression proteomics. Further, 
Figure 5 Protein complexes have a similar composition in yeast and human. Comparison 
of three TAP protein complexes isolated from human and yeast cells. All orthologous pairs 
human is largely conserved. Coomassie-stained gels are shown only for the human 
purifications. a, Arp2/3 complex; b, Ccr4-Not complex; c, Trapp complex. Hyp. protein, 
hypothetical protein. 
TAP allows the purification of very large complexes. For example, 
we were able to purify yeast TAP-C116 (Ino80), a complex reported 
to have an Mr of about 1M-1.5M (ref. 39), and we identified all known 
and several new interactors. 

We sought to gauge the reliability of our data by comparing the 
experimental results to the literature. Although comparison of our 
data to the YPD complex database is straightforward (Fig. 2b), it is 
very difficult to derive meaningful comparative information from 
the yeast two-hybrid data because the latter produces binary 
interactions, whereas the TAP/MS method yields complex composition 
data. When considering all possible interactions between 
proteins within a complex or a purification, normalized for the 
proteins that were identified in this study, we find that our data 
overlap with only 7% of the interactions seen by yeast two-hybrid 
assays. When compared with the YPD protein complexes, this 
study covers 56%, whereas large-scale yeast two-hybrid approaches 
cover 10%. There are two reasons for this difference. On one hand, 
the yeast two-hybrid approaches touch only 35% of the described 
complexes, compared with 60% in this study. On the other hand, 
there is a generally lower coverage of individual components within 
complexes. The figure for the yeast two-hybrid data is 39% compared 
with 5<5% in this study. To achieve the respective coverage, the 
yeast two-hybrid approaches required processing of 95% of all yeast 
ORFs compared with 25% in our study. Altogether, this illustrates 
that the two methodologies address different aspects of protein 
interaction. Comprehensive two-hybrid approaches do not seem to 
be particularly suited for characterization of protein complexes. 
This supports the view that complex formation is more than the 
sum of binary interactions. However, two-hybrid analysis is of 
exceptional value for the detection of pairwise and transient 
associations. The success of the TAP/MS approach for the characterization 
of protein complexes relies on the conditions used for 
the assembly and retrieval of the complexes. These include maintaining 
protein concentration, localization and post-translational 
modifications in a manner that closely approximates normal 
physiology. Because the TAP/MS method does not provide information 
on the orientation of complex components, complex 
characterization and yeast two-hybrid analysis are ideally complementary. 
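
The comparison of complex-composition data with binary two-hybrid data can be sketched as follows. This is one possible reading of the procedure, not the authors' method: every pair of proteins within a complex is treated as a possible interaction, two-hybrid pairs are kept only when both partners were identified in the complex data set (the normalization step), and the overlap is the fraction of those pairs that also co-occur in a complex. All names and data structures are hypothetical.

```python
from itertools import combinations


def pairs_within(complexes):
    """All unordered protein pairs that co-occur in at least one complex."""
    pairs = set()
    for members in complexes.values():
        for a, b in combinations(sorted(set(members)), 2):
            pairs.add((a, b))
    return pairs


def overlap_fraction(complexes, binary_interactions):
    """Fraction of comparable binary interactions implied by co-membership.

    A binary interaction is 'comparable' when both partners appear somewhere
    in `complexes`; self-interactions are ignored.
    """
    identified = set().union(*complexes.values())
    implied = pairs_within(complexes)
    comparable = {
        tuple(sorted(pair))
        for pair in binary_interactions
        if pair[0] != pair[1] and set(pair) <= identified
    }
    if not comparable:
        return 0.0
    return len(comparable & implied) / len(comparable)


if __name__ == "__main__":
    toy_complexes = {"C1": ["a", "b", "c"], "C2": ["c", "d"]}
    toy_two_hybrid = [("b", "a"), ("a", "d"), ("d", "x")]
    print(overlap_fraction(toy_complexes, toy_two_hybrid))
```

A complex of n proteins contributes n(n-1)/2 possible pairs, so large complexes dominate the implied set. This is one reason a low overlap with binary data is not surprising.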

Another outcome of our study is the ease and frequency by which 
protein complexes can be retrieved from cells. These biophysical 
properties of protein complexes may suggest cooperative binding. 
Bridging factors, post-translational modifications, allosteric structural 
changes, and binding of ions and metabolites can all cooperate 
to increase the number of short-range interactions between individual 
proteins in an assembly. Moreover, several proteins with 
critical regulatory functions are non-globular or intrinsically 
unstructured; folding into ordered structures occurs only on 
binding to other proteins, offering the opportunity of control 
over the thermodynamics of the binding process. We anticipate 
that some of the complexes identified in this study will be useful for 
structural studies. 

Although we used only one set of experimental parameters here 
to grow and maintain cells for the evaluation of complex composition, 
we will, in the future, systematically modify experimental 
parameters to evaluate the impact of a changing environment on 
complex integrity. These studies should help to elucidate the 
dynamics of complex assembly and disassembly. Moreover, the 
strains generated can be used for parallel assessment of protein 
expression levels. Finally, tandem-affinity-purified complexes from 
this collection may be a starting point to develop protein chips 
containing physiological protein complexes 7 and for assessment of 
biochemical activity of proteins within their molecular 
environment. 

The statistical analysis of the large-scale yeast approach shows a
clear tendency of proteins that are part of the set of metazoan
orthologues to bind to other proteins of the same set. Moreover, we
also observed a propensity to associate among the products of
essential genes. Complexes containing orthologues and essential
proteins overlap significantly. This recalls the proposition that the
products of essential genes are also more likely to represent central
components in a protein network. Together, this raises the
possibility that orthologue complexes represent the building blocks of a
eukaryotic 'core proteome' covering basic cellular function. We
believe that a significant number of the yeast complexes described
here will have human equivalents and expect that these may form
the basis for understanding multifactorial diseases. Through the
'guilt by association' concept, we are able to propose cellular roles
for proteins that had no previous functional annotation and new
roles for known proteins. Assessment of the physiological molecular
context of proteins, as described here, may be one of the most
efficient and unambiguous routes towards the assignment of gene
identity and function.
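
The 'guilt by association' idea can be illustrated with a minimal sketch. It assumes that complexes are given as sets of protein names and that each annotated protein carries a single functional category; the function name, the data layout and the majority-vote rule are illustrative assumptions, not the procedure used in the study.

```python
from collections import Counter


def propose_roles(complexes, annotations):
    """Suggest a role for each unannotated protein from its complex partners.

    complexes:   dict mapping a complex name to a set of protein names
    annotations: dict mapping a protein name to one functional category
    Returns a dict: unannotated protein -> most frequently suggested category.
    Ties are broken arbitrarily in this sketch.
    """
    votes = {}
    for members in complexes.values():
        # Tally the categories of the annotated members of this complex.
        tally = Counter(annotations[p] for p in members if p in annotations)
        if not tally:
            continue  # no annotated partner to transfer a role from
        top_role = tally.most_common(1)[0][0]
        # Each complex casts one vote for each of its unannotated members.
        for protein in members:
            if protein not in annotations:
                votes.setdefault(protein, Counter())[top_role] += 1
    return {p: v.most_common(1)[0][0] for p, v in votes.items()}
```

A protein found in several complexes collects one vote per complex, so a recurring functional context outweighs a single co-purification.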

Our analysis allowed us to group cellular proteins into about 200
complexes. These complexes are connected to each other by shared
components. The network that resulted is a functional description
of the eukaryotic proteome at a higher level of organization. Such
higher-order maps will bring an increasing quality to our appreciation
of biological systems. It is expected that this may provide drug
discovery programmes with a molecular context for the choice and
evaluation of drug targets.
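
A network of complexes linked by shared components can be sketched as follows. The representation (a dictionary of complex names to sets of protein names) and the function name are assumptions made for illustration; the study's own software is not shown in the text.

```python
from itertools import combinations


def complex_network(complexes):
    """Link every pair of complexes that share at least one component.

    complexes: dict mapping a complex name to a set of protein names
    Returns a dict: (complex_a, complex_b) -> set of shared proteins,
    with the two names of each pair in sorted order.
    """
    edges = {}
    for a, b in combinations(sorted(complexes), 2):
        shared = complexes[a] & complexes[b]
        if shared:
            edges[(a, b)] = shared
    return edges
```

Each returned edge records which proteins form the link, so the higher-order map can be traced back to individual components.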

Methods

Yeast strain construction and TAP 

Yeast strains expressing TAP-tagged ORFs were constructed in a semi-automated way
essentially as done previously. Cells were cultured at 30 °C in YPD medium, collected
during exponential growth and lysed mechanically with glass beads. Purifications were
done as described. For the purification of membrane proteins, the detergent
concentration was adjusted to 1.5% after lysis.



TAP from human cells

Retroviral transduction vectors were generated by directional cloning of PCR-amplified
ORFs into a modified version of a MoMLV-based vector via the Gateway site-specific
recombination system (Life Technologies). For NOT2 and ARPC2, the TAP cassette was
fused to the amino terminus and for QCT3 to the C terminus. In all cases, cell lines
were generated by retrovirus-mediated gene transfer and complexes were purified after cell
expansion and cultivation for at least 5 days by a modified TAP protocol.

High-throughput protein identification 

Purified proteins were concentrated, separated on 4-12% NuPAGE gels (Novex) and
stained with colloidal Coomassie blue. Gels were sliced into 1.25-mm bands across the
entire separation range of each lane to sample all potential interacting proteins without
bias with respect to size and relative abundance. Cut bands were digested with trypsin
essentially as described. The resulting tryptic peptide mixtures were analysed by
automated MALDI-TOF MS (Voyager DE-STR, Applied Biosystems). Proteins were
identified by automated peptide mass fingerprinting using the software tool Knexus
(Proteometrics) and an in-house built sequence database of S. cerevisiae proteins.
Experiments with human orthologues of yeast proteins were subjected to the same cutting
and digestion procedure but protein identification was accomplished by automated
LC-MS/MS analysis (Ultimate, LC Packings; Q-TOF, Micromass) in conjunction with
searches of the GenPept database UV^f^cbi J iImj^ 5 ov/ C enb*nWa«np«pE.ia. a x) 
using the software tool Mascot (Matrix Science).
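
The principle of peptide mass fingerprinting can be shown with a minimal sketch: digest each database sequence in silico with trypsin, compute peptide masses, and rank proteins by how many observed masses they explain. This is not the scoring used by Knexus or Mascot. The function names, the simple match count, the 0.2 Da tolerance, the absence of missed cleavages and modifications, and the use of neutral monoisotopic masses (rather than the singly protonated ions measured by MALDI) are all simplifying assumptions.

```python
import re

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def score_protein(observed_masses, sequence, tolerance=0.2):
    """Count observed masses that match any theoretical tryptic peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(
        any(abs(obs - theo) <= tolerance for theo in theoretical)
        for obs in observed_masses
    )


def best_match(observed_masses, database, tolerance=0.2):
    """Return the name of the database entry explaining the most masses."""
    return max(
        database,
        key=lambda name: score_protein(observed_masses, database[name], tolerance),
    )
```

Production search tools additionally model missed cleavages, modifications and the statistical significance of a match, all of which this sketch leaves out.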



Bioinformatics

Functional and localization information about yeast proteins was retrieved from the YPD
released in August 2001. To get a more concise classification for localization and function,
YPD data sets were merged. For analysis of assembled complexes from our purifications
and the published complexes, we reviewed the YPD data set manually.
Protein domain analysis was performed with SMART. PsiBlast was used for homology
analysis. All additional analysis software was developed by us, using Perl and Python. For
the comparison of purifications to the known complexes, each member of a complex and,
independently, of the purifications, was considered to be connected to each other.
Redundant interactions were merged. For the comparison with yeast two-hybrid assays we
counted every described binary interaction included in our purifications and included in
the YPD complexes, respectively. For calculation of coverage of complexes, we considered
only the YPD complexes that include at least two identified components (from yeast
two-hybrid interaction, or from complex purification). For the calculation of coverage of
complex components, all purifications or yeast two-hybrid pairs with an entry point or
bait falling into a described complex were considered.
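
The comparison steps described above can be sketched in Python, one of the two languages the authors name. The sketch assumes complexes and purifications are available as collections of protein names; the function names and data layout are hypothetical and do not reproduce the authors' software.

```python
from itertools import combinations


def pairwise_interactions(groups):
    """Expand each group (a complex or a purification) into all member pairs.

    Every member is treated as connected to every other member. Storing
    each pair as a frozenset inside a set merges redundant interactions,
    including the same pair listed in the opposite order.
    """
    pairs = set()
    for members in groups:
        for a, b in combinations(sorted(set(members)), 2):
            pairs.add(frozenset((a, b)))
    return pairs


def covered_complexes(known_complexes, identified_proteins, minimum=2):
    """Names of known complexes with at least `minimum` identified components.

    known_complexes:     dict mapping a complex name to a set of protein names
    identified_proteins: set of proteins found by one experimental method
    """
    return sorted(
        name
        for name, members in known_complexes.items()
        if len(members & identified_proteins) >= minimum
    )
```

With both data sets expanded to pairs, the overlap with two-hybrid data is a plain set intersection, for example `pairwise_interactions(purifications) & two_hybrid_pairs`, where `two_hybrid_pairs` is a set of frozensets built the same way.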
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